Water demand estimation and outlier detection from smart meter data using classification and Big Data methods
Abstract
Automatic Meter Reading (AMR) systems are being deployed in many cities to gain finer-grained insight into the status and behavior of District Metering Areas (DMAs). Until now, water consumption readings for the population were taken once a month or once every two months. In contrast, AMR systems provide hourly readings for households and more frequent readings for large consumers. On the one hand, this paper aims at predicting water demand and detecting suspicious behaviors (e.g. a leak, a smart meter breakdown, or even fraud) by extracting water consumption patterns. On the other hand, as the main contribution of this paper, a software framework based on Big Data techniques is presented to overcome the limitations of traditional data storage and data analysis, since the volume of AMR data collected by water utilities is enormous and keeps growing as the technology expands.
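To make the idea of flagging suspicious hourly readings against a consumption pattern concrete, here is a minimal sketch. It is not the paper's method (the abstract mentions classification and a Big Data framework but gives no details). As an illustrative stand-in, it builds a per-hour median profile from past readings and flags new readings whose robust z-score, based on the median absolute deviation, exceeds a threshold. The function names, the threshold of 3.5, the `min_mad` floor, and the sample data are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import median


def hourly_profile(readings):
    """Build {hour_of_day: (median, mad)} from (hour_of_day, litres) pairs."""
    by_hour = defaultdict(list)
    for hour, litres in readings:
        by_hour[hour].append(litres)
    profile = {}
    for hour, values in by_hour.items():
        med = median(values)
        # Median absolute deviation: a spread estimate that outliers barely move.
        mad = median([abs(v - med) for v in values])
        profile[hour] = (med, mad)
    return profile


def flag_outliers(readings, profile, threshold=3.5, min_mad=1.0):
    """Return readings whose robust z-score for their hour exceeds threshold.

    threshold and min_mad are hypothetical tuning values, not from the paper.
    """
    flagged = []
    for hour, litres in readings:
        med, mad = profile[hour]
        # 0.6745 rescales the MAD so the score is comparable to a z-score
        # under normality; min_mad guards against division by zero.
        score = 0.6745 * abs(litres - med) / max(mad, min_mad)
        if score > threshold:
            flagged.append((hour, litres))
    return flagged


# Made-up history for one household: hour 3 (night) and hour 19 (evening).
history = [(3, 10), (3, 12), (3, 11), (3, 9), (3, 10),
           (19, 40), (19, 42), (19, 38), (19, 41), (19, 39)]
new_readings = [(3, 11), (3, 60), (19, 40)]

profile = hourly_profile(history)
flagged = flag_outliers(new_readings, profile)
print(flagged)
```

With this made-up data, only the 60-litre reading at 3 a.m. is flagged: it is far above the typical night-time consumption, which is the kind of pattern a leak could produce. A real deployment would need per-customer profiles, weekday and seasonal effects, and a distributed store for the data volumes the abstract describes.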
Similar resources
Identification of outliers types in multivariate time series using genetic algorithm
Multivariate time series data are often modeled using a vector autoregressive moving average (VARMA) model. But the presence of outliers can violate the stationarity assumption and may lead to wrong modeling, biased estimation of parameters and inaccurate prediction. Thus, detection of these points and how to deal properly with them, especially in relation to modeling and parameter estimation of VARMA m...
Securing Smart Grid In-Network Aggregation through False Data Detection
Existing prevention-based secure in-network data aggregation schemes for the smart grids cannot effectively detect accidental errors and falsified data injected by malfunctioning or compromised meters. In this work, we develop a light-weight anomaly detector based on kernel density estimator to locate the smart meter from which the falsified data is injected. To reduce the overhead at the colle...
Outlier Detection for Support Vector Machine using Minimum Covariance Determinant Estimator
The purpose of this paper is to identify the points that affect the performance of one of the important data mining algorithms, namely the support vector machine. The final classification decision is made based on a small portion of the data called support vectors. So, the existence of atypical observations among the aforementioned points will result in deviation from the correct decision. Thus...
Clustering and Support Vector Regression for Water Demand Forecasting and Anomaly Detection
This paper presents a completely data-driven and machine-learning-based approach, in two stages, to first characterize and then forecast hourly water demand in the short term, with applications to two different data sources: urban water demand (SCADA data) and individual customer water consumption (AMR data). In the first case, reliable forecasting can be used to optimize operations, particularl...
A statistical test for outlier identification in data envelopment analysis
In the use of peer group data to assess individual, typical or best practice performance, the effective detection of outliers is critical for achieving useful results. In these "deterministic" frontier models, statistical theory is now mostly available. This paper deals with the statistical pared sample method and its capability of detecting outliers in data envelopment analysis. In the prese...